Looking beyond appearances: Synthetic training data for deep CNNs in re-identification

نویسندگان

  • Igor Barros Barbosa
  • Marco Cristani
  • Barbara Caputo
  • Aleksander Rognhaugen
  • Theoharis Theoharis
چکیده

Re-identification is generally carried out by encoding the appearance of a subject in terms of outfit, suggesting scenarios where people do not change their attire. In this paper we overcome this restriction, by proposing a framework based on a deep convolutional neural network, SOMAnet, that additionally models other discriminative aspects, namely, structural attributes of the human figure (e.g. height, obesity, gender). Our method is unique in many respects. First, SOMAnet is based on the Inception architecture, departing from the usual siamese framework. This spares expensive data preparation (pairing images across cameras) and allows the understanding of what the network learned. Second, and most notably, the training data consists of a synthetic 100K instance dataset, SOMAset, created by photorealistic human body generation software. Synthetic data represents a good compromise between realistic imagery, usually not required in re-identification since surveillance cameras capture low-resolution silhouettes, and complete control of the samples, which is useful in order to customize the data w.r.t. the surveillance scenario at-hand, e.g. ethnicity. SOMAnet, trained on SOMAset and fine-tuned on recent re-identification benchmarks, outperforms all competitors, matching subjects even with different apparel. The combination of synthetic data with Inception architectures opens up new research avenues in re-identification.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deep Learning for Target Classification from SAR Imagery: Data Augmentation and Translation Invariance

This report deals with translation invariance of convolutional neural networks (CNNs) for automatic target recognition (ATR) from synthetic aperture radar (SAR) imagery. In particular, the translation invariance of CNNs for SAR ATR represents the robustness against misalignment of target chips extracted from SAR images. To understand the translation invariance of the CNNs, we trained CNNs which...

متن کامل

Deep Person Re-Identification with Improved Embedding

Person re-identification task has been greatly boosted by deep convolutional neural networks (CNNs) in recent years. The core of which is to enlarge the inter-class distinction as well as reduce the intra-class variance. However, to achieve this, existing deep models prefer to adopt image pairs or triplets to form verification loss, which is inefficient and unstable since the number of training...

متن کامل

Object Detection Using Deep CNNs Trained on Synthetic Images

The need for large annotated image datasets for training Convolutional Neural Networks (CNNs) has been a significant impediment for their adoption in computer vision applications. We show that with transfer learning an effective object detector can be trained almost entirely on synthetically rendered datasets. We apply this strategy for detecting packaged food products clustered in refrigerator...

متن کامل

Convolutional Gating Network for Object Tracking

Object tracking through multiple cameras is a popular research topic in security and surveillance systems especially when human objects are the target. However, occlusion is one of the challenging problems for the tracking process. This paper proposes a multiple-camera-based cooperative tracking method to overcome the occlusion problem.  The paper presents a new model for combining convolutiona...

متن کامل

Generating High Quality Visible Images from SAR Images Using CNNs

We propose a novel approach for generating high quality visible-like images from Synthetic Aperture Radar (SAR) images using Deep Convolutional Generative Adversarial Network (GAN) architectures. The proposed approach is based on a cascaded network of convolutional neural nets (CNNs) for despeckling and image colorization. The cascaded structure results in faster convergence during training and...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Computer Vision and Image Understanding

دوره 167  شماره 

صفحات  -

تاریخ انتشار 2018